A Robust Boosting Method for Mislabeled Data
Abstract
We propose a new, robust boosting method that uses a sigmoidal function as its loss function. The method is derived by combining the stagewise additive modelling methodology with gradient descent algorithms. Based on intensive numerical experiments, we show that the proposed method achieves lower test error rates than AdaBoost and other regularized methods when the training data are noisy and mislabeled.
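The abstract gives no formulas, so the following is only a minimal sketch of how a sigmoidal loss can be combined with stagewise additive fitting and gradient descent; it is not the authors' algorithm. Everything specific here is an assumption made for illustration: the loss form 1 / (1 + exp(y F(x))), decision stumps as base learners, a fixed step size, and the hypothetical names `sigmoid_loss`, `fit_stump`, `fit_sigmoid_boost` and `predict`.

```python
import numpy as np

def sigmoid_loss(margin):
    """Assumed sigmoidal loss 1 / (1 + exp(margin)), bounded in (0, 1)."""
    return 1.0 / (1.0 + np.exp(margin))

def stump_predict(stump, X):
    """Predict labels in {-1, +1} with a one-feature threshold rule."""
    j, t, sign = stump
    return sign * np.where(X[:, j] <= t, -1.0, 1.0)

def fit_stump(X, y, w):
    """Pick the stump h that maximizes sum_i w_i * y_i * h(x_i)."""
    best, best_score = (0, 0.0, 1.0), -np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            base = np.where(X[:, j] <= t, -1.0, 1.0)
            score = np.sum(w * y * base)
            for sign in (1.0, -1.0):
                if sign * score > best_score:
                    best_score, best = sign * score, (j, t, sign)
    return best

def fit_sigmoid_boost(X, y, n_rounds=50, step=0.5):
    """Stagewise additive fit: each round takes one gradient step on the loss."""
    F = np.zeros(len(y))          # current additive model evaluated on the training set
    ensemble = []
    for _ in range(n_rounds):
        loss = sigmoid_loss(y * F)
        # The negative gradient of the loss w.r.t. F(x_i) is y_i * loss_i * (1 - loss_i).
        # Its magnitude is used as the example weight; it shrinks for large
        # negative margins, so badly misclassified points lose influence.
        w = loss * (1.0 - loss)
        stump = fit_stump(X, y, w)
        F += step * stump_predict(stump, X)   # fixed step size (assumption)
        ensemble.append((step, stump))
    return ensemble

def predict(ensemble, X):
    """Sign of the summed weak-learner outputs."""
    F = np.zeros(X.shape[0])
    for step, stump in ensemble:
        F += step * stump_predict(stump, X)
    return np.where(F >= 0, 1.0, -1.0)
```

Under these assumptions the example weight peaks at 0.25 when the margin is zero and decays toward zero as the margin becomes strongly negative, whereas AdaBoost's exponential weights keep growing on such points. That bounded weighting is one plausible reading of why a sigmoidal loss would tolerate mislabeled examples.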
Related resources
Boosted Noise Filters for Identifying Mislabeled Data
In many practical classification problems, mislabeled data instances (i.e., class noise) exist in the acquired (training) data and often have a detrimental effect on the classification performance. Identifying such noisy instances and removing them from training data can significantly improve the trained classifiers. One such effective noise detector is the so-called ensemble filter, which pred...
On Boosting and Noisy Labels
Boosting is a machine learning technique widely used across many disciplines. Boosting enables one to learn from labeled data in order to predict the labels of unlabeled data. A central property of boosting instrumental to its popularity is its resistance to overfitting. Previous experiments provide a margin-based explanation for this resistance to overfitting. In this thesis, the main finding ...
Kernel Based Detection of Mislabeled Training Examples
The problem of identifying mislabeled training examples has been examined in several studies, with a variety of approaches developed for editing the training data to obtain better classifiers. Many of these approaches involve applying an individual or an ensemble of classifiers to the training set and filtering the mislabeled examples based on their consistency with respect to the classifier’s ...
Experiments on Ensembles with Missing and Noisy Data
One of the potential advantages of multiple classifier systems is an increased robustness to noise and other imperfections in data. Previous experiments on classification noise have shown that bagging is fairly robust but that boosting is quite sensitive. Decorate is a recently introduced ensemble method that constructs diverse committees using artificial data. It has been shown to generally ou...
Active cleaning of label noise
Mislabeled examples in the training data can severely affect the performance of supervised classifiers. In this paper, we present an approach to remove any mislabeled examples in the dataset by selecting suspicious examples as targets for inspection. We show that the large margin and soft margin principles used in support vector machines (SVM) have the characteristic of capturing the mislabeled...
Publication date: 2004